A Novel Approach for Choosing Summary Statistics in Approximate Bayesian Computation
Abstract
The choice of summary statistics is a crucial step in approximate Bayesian computation (ABC). Since statistics are often not sufficient, this choice involves a trade-off between loss of information and reduction of dimensionality. The latter may increase the efficiency of ABC. Here, we propose an approach for choosing summary statistics based on boosting, a technique from the machine-learning literature. We consider different types of boosting and compare them to partial least-squares regression as an alternative. To mitigate the lack of sufficiency, we also propose an approach for choosing summary statistics locally, in the putative neighborhood of the true parameter value. We study a demographic model motivated by the reintroduction of Alpine ibex (Capra ibex) into the Swiss Alps. The parameters of interest are the mean and standard deviation across microsatellites of the scaled ancestral mutation rate (θ_anc = 4N_e u) and the proportion of males obtaining access to matings per breeding season (ω). By simulation, we assess the properties of the posterior distribution obtained with the various methods. According to our criteria, ABC with summary statistics chosen locally via boosting with the L_2-loss performs best. Applying that method to the ibex data, we estimate θ_anc ≈ 1.288 and find that most of the variation across loci of the ancestral mutation rate u is between 7.7 × 10^-4 and 3.5 × 10^-3 per locus per generation. The proportion of males with access to matings is estimated as ω ≈ 0.21, which is in good agreement with recent independent estimates.
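As a rough illustration of the setting in which summary statistics are chosen, the following is a minimal sketch of plain rejection ABC on a toy problem. It is not the ibex demographic model and does not implement the boosting-based selection proposed in the paper. The Normal(mu, 1) simulator, the Uniform(-5, 5) prior, the sample mean as the single summary statistic, the tolerance value, and all function names are assumptions made for this example only.

```python
import random
import statistics


def simulate(mu, n, rng):
    """Toy simulator (assumed for illustration): n draws from Normal(mu, 1)."""
    return [rng.gauss(mu, 1.0) for _ in range(n)]


def summary(data):
    """Summary statistic: the sample mean, which is sufficient for mu in this toy model."""
    return statistics.fmean(data)


def abc_rejection(observed, n_sims, tolerance, rng):
    """Keep prior draws whose simulated summary lies within `tolerance` of the observed one."""
    s_obs = summary(observed)
    accepted = []
    for _ in range(n_sims):
        mu = rng.uniform(-5.0, 5.0)  # draw a candidate from the assumed Uniform(-5, 5) prior
        s_sim = summary(simulate(mu, len(observed), rng))
        if abs(s_sim - s_obs) <= tolerance:
            accepted.append(mu)
    return accepted


rng = random.Random(0)                      # fixed seed so the run is repeatable
observed = simulate(2.0, 50, rng)           # pseudo-observed data with true mu = 2
posterior_sample = abc_rejection(observed, n_sims=5000, tolerance=0.1, rng=rng)
posterior_mean = statistics.fmean(posterior_sample)
```

With a single, sufficient statistic the accepted draws concentrate around the observed sample mean. The trade-off described in the abstract arises when many candidate statistics are available and none is sufficient: comparing simulated and observed data on all of them makes acceptance rare, while dropping some of them loses information.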
Similar resources
DR-ABC: Approximate Bayesian Computation with Kernel-Based Distribution Regression
Performing exact posterior inference in complex generative models is often difficult or impossible due to an expensive to evaluate or intractable likelihood function. Approximate Bayesian computation (ABC) is an inference framework that constructs an approximation to the true likelihood based on the similarity between the observed and simulated data as measured by a predefined set of summary st...
Approximate Bayesian Computation: a nonparametric perspective
Approximate Bayesian Computation is a family of likelihood-free inference techniques that are tailored to models defined in terms of a stochastic generating mechanism. In a nutshell, Approximate Bayesian Computation proceeds by computing summary statistics from the data and giving more weight to the values of the parameters for which the simulated summary statistics resemble the observed ones. ...
Selecting Summary Statistics in Approximate Bayesian Computation for Calibrating Stochastic Models
Approximate Bayesian computation (ABC) is an approach for using measurement data to calibrate stochastic computer models, which are common in biology applications. ABC is becoming the "go-to" option when the data and/or parameter dimension is large because it relies on user-chosen summary statistics rather than the full data and is therefore computationally feasible. One technical challenge wit...
Kernel approximate Bayesian computation in population genetic inferences.
Approximate Bayesian computation (ABC) is a likelihood-free approach for Bayesian inferences based on a rejection algorithm method that applies a tolerance of dissimilarity between summary statistics from observed and simulated data. Although several improvements to the algorithm have been proposed, none of these improvements avoid the following two sources of approximation: 1) lack of sufficie...
Local Kernel Dimension Reduction in Approximate Bayesian Computation
Approximate Bayesian Computation (ABC) is a popular sampling method in applications involving intractable likelihood functions. Without evaluating the likelihood function, ABC approximates the posterior distribution by the set of accepted samples which are simulated with parameters drawn from the prior distribution, where acceptance is determined by distance between the summary statistics of th...